Abstract

The autoregressive moving average (ARMA) model is one of the most important models in time series analysis. We consider the Bayesian estimation of an unknown spectral density in the ARMA model. In the i.i.d. cases, Komaki showed that Bayesian predictive densities based on a superharmonic prior asymptotically dominate those based on the Jeffreys prior. This is shown by using the asymptotic expansion of the risk difference. We obtain the corresponding result in the ARMA model.
I. INTRODUCTION



Let us consider a prediction problem in the Bayesian framework. Suppose that a parametric model

\[ \mathcal{M} := \{ p(x|\theta) : \theta \in \Theta \subset \mathbb{R}^k \} \]

is given and our problem is to estimate $p(y|\theta)$ itself from an observation $x$. If a proper prior density $\pi(\theta)$ is known, the predictive density minimizing the average risk is given by [1]

\[ p_\pi(y|x) := \int p(y|\theta)\,\pi(\theta|x)\,d\theta. \]

If one has no knowledge of the unknown parameter $\theta$, one tends to adopt a noninformative prior. The Jeffreys prior is often recommended as a noninformative prior for several reasons. However, the Jeffreys prior is often improper, i.e., $\int \pi(\theta)\,d\theta = \infty$. In such a situation, the above result no longer holds and other noninformative priors could be recommended. One of those is a superharmonic prior. Komaki showed that Bayesian predictive distributions based on a superharmonic prior asymptotically dominate those based on the Jeffreys prior. He compared the two prior distributions by using the asymptotic expansion of the risk difference.

In the present paper, we extend this result to the ARMA process. We formulate the prediction problem of spectral densities in the ARMA model as described below and obtain the asymptotic expansion of the risk difference. Our conclusion is the same as in the i.i.d. cases. Since we use the properties of the ARMA model only when evaluating the expectation of the log likelihood, it can be expected that almost all our arguments hold true for general stationary Gaussian processes.

A. General setting 

Let us consider a parametric model of a stationary Gaussian process with mean zero. It is known that a stationary Gaussian process corresponds to its spectral density one-to-one (for a proof, see, e.g., [11]). Thus, we focus on the estimation of the true spectral density $S(\omega|\theta_0)$ in a parametric family of spectral densities

\[ \mathcal{M} := \{ S(\omega|\theta) : \theta \in \Theta \subset \mathbb{R}^k \}. \]

The performance of a spectral density estimator $\hat S(\omega)$ is evaluated by the Kullback-Leibler divergence

\[ D(S(\omega|\theta_0)\,\|\,\hat S(\omega)) := \int_{-\pi}^{\pi}\frac{d\omega}{4\pi}\left\{ \frac{S(\omega|\theta_0)}{\hat S(\omega)} - 1 - \log\frac{S(\omega|\theta_0)}{\hat S(\omega)} \right\}. \]

The above setting was proposed by Komaki.
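As a small numerical illustration of this loss, the following sketch evaluates the divergence between two spectral densities on a frequency grid. The AR(1) family, the function names and the grid size are assumptions made only for this example; they do not come from the paper.

```python
import numpy as np

def ar1_spectral_density(omega, a, sigma2=1.0):
    """Spectral density of the AR(1) process x_t = a x_{t-1} + e_t (illustrative model)."""
    return sigma2 / (2.0 * np.pi * (1.0 - 2.0 * a * np.cos(omega) + a ** 2))

def kl_spectral(s_true, s_est, num=4096):
    """D(S_true || S_est) = int_{-pi}^{pi} dw/(4 pi) {r - 1 - log r}, with r = S_true / S_est.

    The integral is approximated by the midpoint rule on a uniform grid, so
    int dw/(4 pi) f(w) is half the grid average of f.
    """
    omega = -np.pi + 2.0 * np.pi * (np.arange(num) + 0.5) / num
    r = s_true(omega) / s_est(omega)
    return 0.5 * np.mean(r - 1.0 - np.log(r))
```

For two white-noise densities with variances $\sigma_1^2$ and $\sigma_2^2$ the ratio is constant, and the divergence reduces to $\frac12(r - 1 - \log r)$ with $r = \sigma_1^2/\sigma_2^2$, which gives a quick sanity check of the implementation.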



B. Bayesian framework 

First, let us consider minimizing the average risk assuming that a proper prior density $\pi(\theta)$ is known in advance. Aitchison's result [1] applies to this setting. The spectral density estimator minimizing the average risk,

\[ E^{\Theta}E^{X}[D(S(\omega|\theta)\,\|\,\hat S(\omega))] := \int d\theta\,\pi(\theta)\int dx_1\cdots dx_n\, p_n(x_1,\dots,x_n|\theta)\, D(S(\omega|\theta)\,\|\,\hat S(\omega)), \]

is given by the Bayesian spectral density (with respect to $\pi(\theta)$), which is defined by

\[ S_\pi(\omega) := \int S(\omega|\theta)\,\pi(\theta|x)\,d\theta. \qquad (1) \]

We call $S_\pi(\omega)$ in (1) a Bayesian spectral density even when an improper prior distribution is considered.
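The definition (1) can be sketched numerically by replacing the posterior integral with a weighted sum over a parameter grid. The example below assumes an AR(1) model with unit innovation variance and uses the prior $(1-a^2)^{-1/2}$, which is the square root of the asymptotic Fisher information of that model; the model, the grid approximation and all function names are illustrative assumptions rather than the paper's construction.

```python
import numpy as np

def ar1_spectral_density(omega, a):
    """Spectral density of an AR(1) process with unit innovation variance."""
    return 1.0 / (2.0 * np.pi * (1.0 - 2.0 * a * np.cos(omega) + a ** 2))

def ar1_log_likelihood(x, a):
    """Exact Gaussian log likelihood of a stationary AR(1) sample, constant omitted."""
    resid = x[1:] - a * x[:-1]
    quad = (1.0 - a ** 2) * x[0] ** 2 + resid @ resid
    return -0.5 * quad + 0.5 * np.log(1.0 - a ** 2)

def jeffreys_prior_ar1(a):
    """Unnormalized prior proportional to sqrt(1 / (1 - a^2))."""
    return 1.0 / np.sqrt(1.0 - a ** 2)

def bayesian_spectral_density(omega, x, a_grid, prior):
    """Grid approximation of S_pi(w) = int S(w|a) pi(a|x) da."""
    log_post = np.array([ar1_log_likelihood(x, a) for a in a_grid]) + np.log(prior(a_grid))
    weights = np.exp(log_post - log_post.max())   # subtract the max for numerical stability
    weights /= weights.sum()                      # normalized posterior weights on the grid
    densities = np.array([ar1_spectral_density(omega, a) for a in a_grid])
    return weights @ densities
```

Because the result is a convex combination of members of the family, it generally lies outside the parametric family itself, which is why the estimator is compared with the plug-in density $S(\omega|\hat\theta)$ later on.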



C. Choice of a noninformative prior 

If one has no information on the unknown parameter $\theta$, it is natural to adopt a noninformative prior in the Bayesian framework. There is much room for argument over the choice of a noninformative prior. While the Jeffreys prior is a well-known candidate for several reasons, it can be expected that a superharmonic prior is a better choice in some cases. The reason is that stationary Gaussian processes become close to the i.i.d. cases as the sample size becomes large, and a superharmonic prior can be better than the Jeffreys prior in the i.i.d. cases.
D. Construction

In the following section, we briefly review basic results necessary for the asymptotic expansion. The asymptotic expansion of the posterior distribution is presented. In Section III, we obtain the asymptotic expansion of the Bayesian spectral density. For the ARMA model, it can also be written in terms of differential-geometrical quantities as in the i.i.d. cases. In Section IV, we evaluate the expectation of the KL-divergence from the true spectral density $S(\omega|\theta_0)$ to the Bayesian spectral density $S_f$ up to the second order for an arbitrary (possibly improper) prior $f(\theta)$. Finally, we obtain the principal term of the risk difference between $S_f$ and $S_{\pi_J}$, where $\pi_J$ denotes the Jeffreys prior. As a direct consequence of this result, a superharmonic prior is recommended as a noninformative prior if there exists a positive superharmonic function on the corresponding model manifold.

II. PRELIMINARY 

A. Notation and assumption 

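In the present section, we consider general stationary Gaussian processes with mean zero. We recall that the likelihood function is given by

\[ p_n(X_n|\theta) = \frac{1}{(2\pi)^{n/2}\sqrt{\det\Sigma_n(\theta)}}\exp\left(-\frac12 X_n'\Sigma_n(\theta)^{-1}X_n\right), \qquad (2) \]

where $X_n = (x_1,\dots,x_n)'$, $\Sigma_n$ denotes the covariance matrix, and $\theta \in \Theta \subset \mathbb{R}^k$ denotes an unknown parameter. We often use the log likelihood in the form omitting the constant term,

\[ l_n(\theta) = -\frac12 X_n'\Sigma_n(\theta)^{-1}X_n - \frac12\log\det\Sigma_n(\theta). \]

We also assume that an arbitrary prior density $\pi(\theta)$ on $\Theta$ is given.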
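The log likelihood above can be evaluated directly from the Toeplitz covariance matrix. The sketch below does this for an AR(1) process, whose autocovariance $\sigma^2 a^{|h|}/(1-a^2)$ is standard; the choice of model and the function names are assumptions made for illustration.

```python
import numpy as np

def ar1_covariance(n, a, sigma2=1.0):
    """Toeplitz covariance matrix of a stationary AR(1) process: gamma(h) = sigma2 a^|h| / (1 - a^2)."""
    lags = np.abs(np.subtract.outer(np.arange(n), np.arange(n)))
    return sigma2 * a ** lags / (1.0 - a ** 2)

def log_likelihood(x, cov):
    """l_n = -x' cov^{-1} x / 2 - log det(cov) / 2, with the constant term omitted."""
    sign, logdet = np.linalg.slogdet(cov)     # stable log-determinant
    quad = x @ np.linalg.solve(cov, x)        # avoids forming the explicit inverse
    return -0.5 * quad - 0.5 * logdet
```

For AR(1) the same quantity has a closed form, since $\det\Sigma_n = \sigma^{2n}/(1-a^2)$ and the quadratic form reduces to the sum of squared one-step residuals plus a term for the first observation, so the matrix computation can be checked against it.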



In the present paper, we assume several regularity conditions (see, e.g., Taniguchi and Kakizawa [11]).

Differential operators are denoted by $\partial_i := \partial/\partial\theta^i$, as is common in differential geometry (for other basic notation, see, e.g., Kobayashi and Nomizu). We also use Einstein's summation convention: if an index occurs twice in any one term, once as an upper and once as a lower index, summation over that index is implied.
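The summation convention maps directly onto `numpy.einsum`, where repeated index letters are summed. The following sketch shows two contractions of the kind used later in the paper, $T_i = T_{ijk}g^{jk}$ and $g^{ij}F_iF_j$; the function names and array layout are assumptions for illustration.

```python
import numpy as np

def contract_T(T, g_inv):
    """Return T_i = T_{ijk} g^{jk}; the repeated indices j and k are summed."""
    return np.einsum("ijk,jk->i", T, g_inv)

def quadratic_form(g_inv, F):
    """Return the scalar g^{ij} F_i F_j."""
    return np.einsum("ij,i,j->", g_inv, F, F)
```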



B. Asymptotic expansion of the posterior density 

For stationary Gaussian time series models, we have the asymptotic expansion of the posterior density

\[ \pi(\theta|X_n) = \frac{p_n(X_n|\theta)\,\pi(\theta)}{\int d\theta\, p_n(X_n|\theta)\,\pi(\theta)}. \]

Here, we give the moment form of the expansion.

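Lemma 1.

The asymptotic expansion of the posterior density in the moment form is given by

\[
\int \Delta\theta^{i_1}\cdots\Delta\theta^{i_p}\,\pi(\theta|x)\,d\theta
= \int \Delta\theta^{i_1}\cdots\Delta\theta^{i_p}\,
\frac{1}{(2\pi)^{k/2}}\frac{1}{\sqrt{\det[n^{-1}\hat J_n(\hat\theta)^{-1}]}}
\exp\Big\{-\frac n2[\hat J_n(\hat\theta)]_{ij}\Delta\theta^i\Delta\theta^j\Big\}
\]
\[
\qquad\times\Big\{1 + \partial_i\log\pi(\hat\theta)\,\Delta\theta^i + \frac{1}{3!}\,\partial_i\partial_j\partial_k l_n(\hat\theta)\,\Delta\theta^i\Delta\theta^j\Delta\theta^k + O_p(n^{-1})\Big\}\,d\theta, \qquad (3)
\]

where $\hat\theta$ is the maximum likelihood estimate and

\[ \Delta\theta^i := \theta^i - \hat\theta^i(x), \qquad [\hat J_n(\hat\theta)]_{ij} := -\frac1n\frac{\partial^2 l_n(\hat\theta)}{\partial\theta^i\partial\theta^j}. \]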

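For the derivation, see, for example, Philippe and Rousseau [9]. From this formula, we can calculate $E^{\pi}[\Delta\theta^{i_1}\cdots\Delta\theta^{i_p}]$ to an arbitrary order. The $p$-th moment is defined by

\[
\Gamma^{i_1\cdots i_p}(\hat\theta) := \int y^{i_1}\cdots y^{i_p}\,\frac{1}{(2\pi)^{k/2}}\frac{1}{\sqrt{\det[n^{-1}\hat J_n(\hat\theta)^{-1}]}}\exp\Big\{-\frac n2[\hat J_n(\hat\theta)]_{ij}\,y^iy^j\Big\}\,dy^1\cdots dy^k.
\]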
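We need only the second moment $\Gamma^{ij}(\hat\theta)$ and the fourth moment $\Gamma^{ijkl}(\hat\theta)$ in the present paper.

Their stochastic order is evaluated as (for even $p$)

\[ \Gamma^{i_1\cdots i_p}(\hat\theta) = O_p(n^{-p/2}). \]

Using these moment formulas, we obtain

\[ E^{\pi}[\Delta\theta^i] = B^i_\pi(\hat\theta) = \frac{1}{3!}\,\partial_j\partial_k\partial_l l_n(\hat\theta)\,\Gamma^{ijkl}(\hat\theta) + \partial_j\log\pi(\hat\theta)\,\Gamma^{ij}(\hat\theta) + O_p(n^{-2}) \quad (= O_p(n^{-1})), \]
\[ E^{\pi}[\Delta\theta^i\Delta\theta^j] = \Gamma^{ij}(\hat\theta) + O_p(n^{-2}). \]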

III. ASYMPTOTIC EXPANSION OF THE SPECTRAL DENSITY 

We consider a parametric family of spectral densities

\[ \mathcal{M} := \{ S(\omega|\theta) : \theta \in \Theta \subset \mathbb{R}^k,\ S \text{ corresponding to an ARMA process} \}. \]

Let $f(\theta)$ be an arbitrary prior distribution on $\Theta$ and $\pi_J(\theta)$ be the Jeffreys prior. From now on, $\theta_0$ denotes the true parameter. We consider estimating the true spectral density $S(\omega|\theta_0)$ itself instead of $\theta_0$. From the $n$ data $X_n := (x_1,\dots,x_n)$ subject to an ARMA process, we construct the posterior distribution $f(\theta|X_n)$, and the Bayesian spectral density $S_f(\omega)$ with respect to $f(\theta)$ is given by

\[ S_f(\omega) := \int S(\omega|\theta)\,f(\theta|X_n)\,d\theta. \]

From the result of the previous section, we obtain

Lemma 2.

Let the maximum likelihood estimator $\hat\theta = \theta_0 + O_p(n^{-1/2})$ be given. Then the Bayesian spectral density is evaluated as

\[ S_f(\omega) = S(\omega|\hat\theta) + \partial_iS(\omega|\hat\theta)\,B^i_f(\hat\theta) + \frac12\,\partial_i\partial_jS(\omega|\hat\theta)\,\Gamma^{ij}(\hat\theta) + O_p(n^{-3/2}). \qquad (4) \]
Proof.

Using the Taylor expansion of $S(\omega|\theta)$ around $\hat\theta$, Eq. (4) immediately follows from Lemma 1.
Q.E.D.



A. Geometrical expansion of the derivatives of the log likelihood 

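We rewrite Eq. (4) using geometrical quantities, which are defined by

\[ g_{ij} := \int_{-\pi}^{\pi}\frac{d\omega}{4\pi}\,\partial_i\log S(\omega|\theta_0)\,\partial_j\log S(\omega|\theta_0), \]
\[ \Gamma^{(m)}_{i,jk} := \int_{-\pi}^{\pi}\frac{d\omega}{4\pi}\,\partial_i\log S(\omega|\theta_0)\,\frac{\partial_j\partial_kS(\omega|\theta_0)}{S(\omega|\theta_0)}, \]
\[ M_{i,jkl} := \int_{-\pi}^{\pi}\frac{d\omega}{4\pi}\,\frac{\partial_iS(\omega|\theta_0)}{S(\omega|\theta_0)}\,\frac{\partial_j\partial_k\partial_lS(\omega|\theta_0)}{S(\omega|\theta_0)}, \]
\[ T_{ijk} := \int_{-\pi}^{\pi}\frac{d\omega}{2\pi}\,\partial_i\log S(\omega|\theta_0)\,\partial_j\log S(\omega|\theta_0)\,\partial_k\log S(\omega|\theta_0), \]
\[ L_{ij,kl} := \int_{-\pi}^{\pi}\frac{d\omega}{4\pi}\,\partial_i\log S(\omega|\theta_0)\,\partial_j\log S(\omega|\theta_0)\,\frac{\partial_k\partial_lS(\omega|\theta_0)}{S(\omega|\theta_0)}. \]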

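For these geometrical notations, see, e.g., Amari. It is convenient to introduce some notation for the log likelihood $l_n(\theta_0)$ and its derivatives,

\[ L_{i_1\cdots i_r}(\theta) := \frac1n\frac{\partial^r l_n(\theta)}{\partial\theta^{i_1}\cdots\partial\theta^{i_r}}, \]

and

\[ m_{i_1\cdots i_r}(\theta) := E_{\theta_0}[L_{i_1\cdots i_r}(\theta)]. \]

Note that if $\theta \neq \theta_0$, then, for example, $m_i(\theta) = E_{\theta_0}[L_i(\theta)] \neq 0$, but $m_i(\theta_0) = 0$. Likewise, the expectations of products of the log likelihood derivatives are defined by, e.g.,

\[ m_{ij,k}(\theta) := n\,E_{\theta_0}[L_{ij}(\theta)L_k(\theta)]. \]

We omit the argument unless otherwise necessary. Other important notations are $L^{ij}(\theta)$ and $m^{ij}(\theta)$, which denote the inverse matrices of $L_{ij}(\theta)$ and $m_{ij}(\theta)$, respectively. Note that

\[ L^{ij} := (L^{-1})^{ij} = m^{ij} - m^{ik}(\delta L)_{kl}\,m^{lj} + \cdots, \quad\text{where } (\delta L)_{ik} := L_{ik} - m_{ik}\ (= O_p(n^{-1/2})). \]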

Lemma 3.

For the ARMA model, we obtain the explicit forms of $m_{ij}$, $m^{ij}$, $m_{ijk}$ and $m_{ij,k}$. They are represented by geometrical quantities:

\[ m_{ij} = -g_{ij} + O(n^{-1}) = O(1), \]
\[ m^{ij} = -g^{ij} + O(n^{-1}) = O(1), \]
\[ m_{ijk} = 2T_{ijk} - (\Gamma^{(m)}_{i,jk} + \Gamma^{(m)}_{j,ik} + \Gamma^{(m)}_{k,ij}) + O(n^{-1}) \]
\[ \qquad = -(\Gamma^{(e)}_{i,jk} + \Gamma^{(e)}_{j,ik} + \Gamma^{(e)}_{k,ij} + T_{ijk}) + O(n^{-1}) = O(1), \]
\[ m_{ij,k} = \Gamma^{(m)}_{k,ij} - T_{ijk} + O(n^{-1}) = \Gamma^{(e)}_{k,ij} + O(n^{-1}) = O(1), \]

where $\Gamma^{(e)}_{i,jk} := \Gamma^{(m)}_{i,jk} - T_{ijk}$.



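Proof.

First, we show $m_{ij} = -g_{ij} + O(n^{-1})$. From straightforward calculation, we obtain

\[ m_{ij} = -\frac{1}{2n}\mathrm{Tr}\{\Sigma^{-1}(\partial_i\Sigma)\Sigma^{-1}(\partial_j\Sigma)\}. \qquad (5) \]

Here, the following fact holds for a parametric family of the spectral densities of the ARMA model (see Lemma 4.1.2 in [11]).

Fact [11].

Let $p_n(x_1,\dots,x_n|\theta)$ be given by Eq. (2) and let the corresponding spectral density be $S(\omega|\theta)$. Then, for arbitrary $k \geq 1$ and $p_1,\dots,p_k$,

\[ \frac1n\mathrm{Tr}\{\Sigma^{-1}(D_{p_1}\Sigma)\Sigma^{-1}\cdots\Sigma^{-1}(D_{p_k}\Sigma)\} = \frac{1}{2\pi}\int_{-\pi}^{\pi}\frac{D_{p_1}S(\omega|\theta)}{S(\omega|\theta)}\cdots\frac{D_{p_k}S(\omega|\theta)}{S(\omega|\theta)}\,d\omega + O(n^{-1}), \]

where $D_p$ denotes an arbitrary $p$-th order differential operator.

Using the fact, the trace in the r.h.s. of (5) is rewritten in the form of the integral

\[ \frac{1}{2n}\mathrm{Tr}\{\Sigma^{-1}(\partial_i\Sigma)\Sigma^{-1}(\partial_j\Sigma)\} = \int_{-\pi}^{\pi}\frac{d\omega}{4\pi}\,\partial_i\log S(\omega|\theta_0)\,\partial_j\log S(\omega|\theta_0) + O(n^{-1}). \]

Thus, we obtain

\[ m_{ij} = -g_{ij} + O(n^{-1}). \]

Since $m^{ij}$ is the inverse of $m_{ij}$, the second equation clearly holds. The other equations are shown in the same way.

Q.E.D.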
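The first statement of Lemma 3 can be checked numerically in a simple case. The sketch below assumes an AR(1) model with unit innovation variance, for which the spectral integral gives $g_{aa} = 1/(1-a^2)$, and compares it with the finite-$n$ trace $\frac{1}{2n}\mathrm{Tr}\{(\Sigma^{-1}\partial_a\Sigma)^2\}$. The model choice and the function names are assumptions for illustration.

```python
import numpy as np

def fisher_metric_spectral(a, num=4096):
    """g_aa = int dw/(4 pi) (d/da log S)^2 for AR(1), by the midpoint rule."""
    omega = -np.pi + 2.0 * np.pi * (np.arange(num) + 0.5) / num
    denom = 1.0 - 2.0 * a * np.cos(omega) + a ** 2
    dlog_s = (2.0 * np.cos(omega) - 2.0 * a) / denom   # derivative of -log(denom) in a
    return 0.5 * np.mean(dlog_s ** 2)

def fisher_metric_trace(a, n):
    """(1 / 2n) Tr{(Sigma^{-1} dSigma/da)^2} for the n x n AR(1) covariance matrix."""
    h = np.abs(np.subtract.outer(np.arange(n), np.arange(n)))
    cov = a ** h / (1.0 - a ** 2)
    # d/da [a^h / (1 - a^2)] = h a^(h-1) / (1 - a^2) + 2 a^(h+1) / (1 - a^2)^2
    dcov = h * a ** np.maximum(h - 1, 0) / (1.0 - a ** 2) + 2.0 * a * a ** h / (1.0 - a ** 2) ** 2
    ratio = np.linalg.solve(cov, dcov)
    return np.trace(ratio @ ratio) / (2.0 * n)
```

With `a = 0.5` and `n = 200`, the two values differ by a term of order $1/n$, in line with the $O(n^{-1})$ remainder in the Fact.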



B. Geometrical expansion of the Bayesian spectral density 

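Now we rewrite the asymptotic expansion (4) in terms of the geometrical quantities.

Lemma 4.

Let $\hat\theta$ be the maximum likelihood estimate. Then the following expansion holds:

\[
S_f(\omega) = S(\omega|\hat\theta) + \frac{1}{2n}g^{ij}(\hat\theta)\Big\{\partial_i\partial_jS(\omega|\hat\theta) - \Gamma^{(m)k}_{ij}(\hat\theta)\,\partial_kS(\omega|\hat\theta)\Big\}
+ \frac1n g^{ij}(\hat\theta)\Big\{\partial_j\log\frac{f}{\pi_J}(\hat\theta) + \frac12T_j(\hat\theta)\Big\}\partial_iS(\omega|\hat\theta) + O_p(n^{-3/2}), \qquad (6)
\]

where $T_i := T_{ijk}g^{jk}$. Note that Eq. (6) is formally of the same form as that in the i.i.d. cases if one reads $p(y|\theta)$ as $S(\omega|\theta)$ (see Komaki).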



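Proof.

In order to prove Eq. (6), one can neglect $O_p(n^{-3/2})$ terms. For example, up to this order, the following identity holds:

\[
\Gamma^{ij}(\hat\theta) = -\frac1nL^{ij}(\hat\theta) = -\frac1nm^{ij}(\theta_0) + O_p(n^{-3/2})
= \frac1ng^{ij}(\theta_0) + O_p(n^{-3/2})
= \frac1ng^{ij}(\hat\theta) + O_p(n^{-3/2}).
\]

Now let us rewrite the principal term of $\partial_iS(\omega|\hat\theta)B^i_f(\hat\theta)$ in Eq. (4). From Lemma 3,

\[
L_{jkl}(\hat\theta) = m_{jkl}(\theta_0) + O_p(n^{-1/2})
= 2T_{jkl}(\theta_0) - \big(\Gamma^{(m)}_{j,kl}(\theta_0) + \Gamma^{(m)}_{k,jl}(\theta_0) + \Gamma^{(m)}_{l,jk}(\theta_0)\big) + O_p(n^{-1/2})
\]
\[
\qquad = 2T_{jkl}(\hat\theta) - \big(\Gamma^{(m)}_{j,kl}(\hat\theta) + \Gamma^{(m)}_{k,jl}(\hat\theta) + \Gamma^{(m)}_{l,jk}(\hat\theta)\big) + O_p(n^{-1/2}),
\]

and

\[
\frac{1}{3!}\,\partial_j\partial_k\partial_l l_n(\hat\theta)\,\Gamma^{ijkl}(\hat\theta)
= \frac{1}{2n}\Big\{2T_{jkl}(\hat\theta) - \big(\Gamma^{(m)}_{j,kl}(\hat\theta) + \Gamma^{(m)}_{k,jl}(\hat\theta) + \Gamma^{(m)}_{l,jk}(\hat\theta)\big)\Big\}g^{ij}(\hat\theta)g^{kl}(\hat\theta) + O_p(n^{-3/2}).
\]

Thus,

\[
\partial_iS(\omega|\hat\theta)B^i_f(\hat\theta)
= -\frac{1}{2n}g^{jk}(\hat\theta)\Gamma^{(m)i}_{jk}(\hat\theta)\,\partial_iS(\omega|\hat\theta) + \frac1ng^{ij}(\hat\theta)\,\partial_iS(\omega|\hat\theta)\Big\{\partial_j\log f(\hat\theta) - \Gamma^{(e)k}_{jk}(\hat\theta)\Big\} + O_p(n^{-3/2})
\]
\[
\qquad = -\frac{1}{2n}g^{jk}(\hat\theta)\Gamma^{(m)i}_{jk}(\hat\theta)\,\partial_iS(\omega|\hat\theta) + \frac1ng^{ij}(\hat\theta)\,\partial_iS(\omega|\hat\theta)\Big\{\partial_j\log\frac{f}{\pi_J}(\hat\theta) + \frac12T_j(\hat\theta)\Big\} + O_p(n^{-3/2}). \qquad (7)
\]

In the last equality, we used the relation

\[ \partial_i\log\pi_J = \partial_i\log\sqrt g = \Gamma^{j}_{ij} = \Gamma^{(e)j}_{ij} + \frac12T_i. \]

Substituting Eq. (7) into Eq. (4), we obtain

\[
S_f(\omega) = S(\omega|\hat\theta) + \frac{1}{2n}g^{ij}(\hat\theta)\Big\{\partial_i\partial_jS(\omega|\hat\theta) - \Gamma^{(m)k}_{ij}(\hat\theta)\,\partial_kS(\omega|\hat\theta)\Big\}
+ \frac1ng^{ij}(\hat\theta)\Big\{\partial_j\log\frac{f}{\pi_J}(\hat\theta) + \frac12T_j(\hat\theta)\Big\}\partial_iS(\omega|\hat\theta) + O_p(n^{-3/2}).
\]

Q.E.D.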



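IV. ASYMPTOTIC EXPANSION OF THE EXPECTATION OF THE KL-DIVERGENCE

In this section, we evaluate the expectation of the KL-divergence up to the second order, focusing on the terms including the prior distribution $f(\theta)$. For simplicity, we introduce the following notation:

\[ S_0 := S(\omega|\theta_0), \qquad \hat S := S(\omega|\hat\theta), \qquad S_f := S_f(\omega), \]

and

\[ \frac{S_f - S_0}{S_0} = \left(\frac{S_f - \hat S}{S_0}\right) + \left(\frac{\hat S - S_0}{S_0}\right). \]

While the first term depends on the prior $f$, the second one is independent of $f$. Note that $\Delta S_m := (\hat S - S_0)/S_0 = O_p(n^{-1/2})$ and $\Delta S_f := (S_f - \hat S)/S_0 = O_p(n^{-1})$. The KL-divergence from the true spectral density $S(\omega|\theta_0)$ to a Bayesian spectral density $S_f(\omega)$ is given by

\[ D(S_0\|S_f) = \int_{-\pi}^{\pi}\frac{d\omega}{4\pi}\left\{\frac{1}{1+y} - 1 - \log\left(\frac{1}{1+y}\right)\right\}, \]

where $y := \Delta S_m + \Delta S_f$. Due to the Taylor expansion

\[ \frac{1}{1+y} - 1 - \log\left(\frac{1}{1+y}\right) = \frac12y^2 - \frac23y^3 + \frac34y^4 - \cdots = \sum_{k=2}^{\infty}(-1)^k\frac{k-1}{k}\,y^k, \]

we obtain

\[
D(S_0\|S_f) = \frac12\left\{\int(\Delta S_m)^2 + 2\int\Delta S_m\Delta S_f + \int(\Delta S_f)^2\right\}
- \frac23\left\{\int(\Delta S_m)^3 + 3\int(\Delta S_m)^2(\Delta S_f)\right\} + \cdots
\]
\[
\qquad = U + \frac12V - 2W + \text{the terms independent of } f + O_p(n^{-5/2}),
\]

where $\int$ stands for $\int_{-\pi}^{\pi}\frac{d\omega}{4\pi}$. We consider the expectation of the following three terms including $f$:

\[ U := \int\Delta S_m\,\Delta S_f \quad (= O_p(n^{-3/2})), \]
\[ V := \int(\Delta S_f)^2 \quad (= O_p(n^{-2})), \]
\[ W := \int(\Delta S_m)^2(\Delta S_f) \quad (= O_p(n^{-2})). \]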
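The series coefficients and the bookkeeping behind $U + \frac12V - 2W$ can be checked pointwise with a few lines of code. In this sketch, `ds_m` and `ds_f` stand for the values of $\Delta S_m$ and $\Delta S_f$ at a single frequency; the function names are hypothetical and the check ignores the frequency integral.

```python
import math

def phi(y):
    """Integrand of the KL-divergence: 1/(1+y) - 1 - log(1/(1+y))."""
    return 1.0 / (1.0 + y) - 1.0 + math.log1p(y)

def phi_series(y, order=30):
    """Truncated Taylor series sum_{k>=2} (-1)^k (k-1)/k y^k."""
    return sum((-1) ** k * (k - 1) / k * y ** k for k in range(2, order + 1))

def prior_dependent_part(ds_m, ds_f):
    """Pointwise integrand of U + V/2 - 2W, the leading terms that involve ds_f."""
    return ds_m * ds_f + 0.5 * ds_f ** 2 - 2.0 * ds_m ** 2 * ds_f
```

Taking `ds_m` of order $\varepsilon$ and `ds_f` of order $\varepsilon^2$ mimics the orders $n^{-1/2}$ and $n^{-1}$; the difference `phi(ds_m + ds_f) - phi(ds_m)` then agrees with `prior_dependent_part` up to terms of order $\varepsilon^5$, matching the stated $O_p(n^{-5/2})$ remainder.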

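Both $\Delta S_m$ and $\Delta S_f$ are given by

\[ \Delta S_m = \frac{\partial_iS_0}{S_0}\,\delta^i + \frac12\frac{\partial_i\partial_jS_0}{S_0}\,\delta^i\delta^j + O_p(n^{-3/2}), \]
\[ \Delta S_f = \frac{1}{nS_0}\left[\frac12g^{ij}(\hat\theta)\Big\{\partial_i\partial_j\hat S - \Gamma^{(m)k}_{ij}(\hat\theta)\,\partial_k\hat S\Big\} + g^{ij}(\hat\theta)F_j(\hat\theta)\,\partial_i\hat S\right] + O_p(n^{-3/2}), \]

where

\[ \delta^i := \hat\theta^i - \theta_0^i \quad\text{and}\quad F_i(\theta) := \partial_i\log\frac{f}{\pi_J}(\theta) + \frac12T_i(\theta). \]

It is convenient to use some formulas for the maximum likelihood estimate $\hat\theta$.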

A. Evaluation of V and W 

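First of all, we calculate $V$ and $W$. We need the principal terms of each. In the principal order, one can replace $\hat\theta$ with $\theta_0$. Each quantity is evaluated at the point $\theta_0$, but for simplicity, we omit $\theta_0$.

\[
n^2V = \int\frac{d\omega}{4\pi}\,(n\Delta S_f)^2
= \int\frac{d\omega}{4\pi}\left\{\frac12g^{ij}\frac{\partial_i\partial_jS - \Gamma^{(m)m}_{ij}\partial_mS}{S} + g^{ij}F_j\frac{\partial_iS}{S}\right\}
\times\left\{\frac12g^{kl}\frac{\partial_k\partial_lS - \Gamma^{(m)m}_{kl}\partial_mS}{S} + g^{kl}F_l\frac{\partial_kS}{S}\right\} + O_p(n^{-1/2})
\]
\[
\qquad = g^{ij}F_iF_j + \text{the terms independent of } f + O_p(n^{-1/2}).
\]

The cross terms vanish because $\int\frac{d\omega}{4\pi}\,\partial_k\log S\,(\partial_i\partial_jS - \Gamma^{(m)m}_{ij}\partial_mS)/S = \Gamma^{(m)}_{k,ij} - g_{km}\Gamma^{(m)m}_{ij} = 0$.

In the same manner, using $E_{\theta_0}[n\delta^i\delta^j] = g^{ij} + O(n^{-1})$, the principal term of the expectation of $W$ is

\[
E_{\theta_0}[n^2W] = E_{\theta_0}\left[\int\frac{d\omega}{4\pi}\,n(\Delta S_m)^2(n\Delta S_f)\right]
= \frac12g^{ij}g^{kl}\Big(L_{ij,kl} - \frac12\Gamma^{(m)m}_{kl}T_{ijm}\Big) + \frac12g^{ij}g^{kl}F_iT_{jkl} + O(n^{-1/2}).
\]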
B. Evaluation of U



For U, we need to evaluate terms up to the second principal order. However, when it 
comes to the expectation, it is not so difficult. For simplicity, we set 

A := (dkdiS- F « 8^3] + g'%S Fi. 

We decompose U into the following three terms: 

1. Evaluation of $U_2$ and $U_3$
Straightforward calculation yields 

1 f 1 . . „ 1 . . M m'' 
1 ■ • , , ("^) 1 5 

+ 2^V ri,ijF,j+0,{n-^) 

and 

[/. = 5^^^- f—( 

' J A7r\ So So J 

1 / (™) (m)''^ (m)''^ \ 

r,,,F,,/^ + ^.,a,(/^ Tjj 

1.1,, ,, (m) (m) (m)^ (m)*^ \ 

= ^29''[9 M^,l,, + ^,g^' T^,l,- T^,kT,,g'''' + g^,^,ig^' Tjj 
2. Evaluation of $U_1$

Finally, we deal with $U_1$.



-F, + + Op{n-^)] 6\ 
n J 

^ 3 

where $\mathcal{T}_i$ denotes $O_p(n^{-3/2})$ terms. Thus, the stochastic expansion of the $U_1$ term requires the second order of the asymptotic expansion of the posterior density. However, the evaluation of the expectation requires no such higher order terms because $E[\mathcal{T}_i\delta^i] = O(n^{-5/2})$ in spite of $\mathcal{T}_i\delta^i = O_p(n^{-2})$. Indeed,

E[5%] = E[{5' - E[5']) X Til + E[d'] x E[Ti] 
= 0{n-^) ■ 0(n~i) + 0{n-^) ■ 0(n-5) 



Thus, we obtain 



E[Ui] = ^F,E[6'] + E[TiS']0{n-l) 
1 / 1 ■ \ 

= ~,Pi% 9" + 0{n-i). 



{n 2) 



C. Asymptotic expansion of the expectation of the KL-divergence 

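Collecting all the terms $U$, $V$ and $W$, we obtain

\[
E_{\theta_0}[D(S_0\|S_f)]
= \frac{1}{n^2}\Big\{-\frac12g^{ij}g^{kl}\Gamma^{(m)}_{i,kl}F_j + \Gamma^{(m)i}_{ij}g^{jk}F_k + \frac12g^{ij}g^{kl}\Gamma^{(m)}_{i,kl}F_j + \partial_i(g^{ij}F_j)\Big\}
\]
\[
\qquad\quad + \frac{1}{2n^2}g^{ij}F_iF_j - \frac{1}{n^2}g^{ik}F_iT_k + \text{the terms independent of } f + O(n^{-5/2})
\]
\[
\qquad = \frac{1}{2n^2}g^{ij}F_iF_j + \frac{1}{n^2}\Big\{\Gamma^{(e)i}_{ij}g^{jk}F_k + \partial_i(g^{ij}F_j)\Big\} + \text{the terms independent of } f + O(n^{-5/2})
\]
\[
\qquad = \frac{1}{2n^2}g^{ij}F_iF_j + \frac{1}{n^2}\nabla^{(e)}_i(g^{ij}F_j) + \text{the terms independent of } f + O(n^{-5/2}),
\]

where $\nabla^{(e)}_iv^i := \partial_iv^i + \Gamma^{(e)i}_{ij}v^j$ denotes the divergence with respect to the e-connection, and we used $\Gamma^{(m)i}_{ij} - T_j = \Gamma^{(e)i}_{ij}$.

Summarizing this, we obtain the following proposition.

Proposition 1.

Let $S_0$ be the true spectral density and $S_f$ the Bayesian spectral density with respect to $f(\theta)$. Then, the asymptotic expansion of the expectation of the KL-divergence from $S_0$ to $S_f$ is given by

\[
E_{\theta_0}[D(S_0\|S_f)] = \frac{1}{2n^2}g^{ij}\Big(\partial_i\log\frac{f}{\pi_J} + \frac12T_i\Big)\Big(\partial_j\log\frac{f}{\pi_J} + \frac12T_j\Big)
+ \frac{1}{n^2}\nabla^{(e)}_i\Big\{g^{ij}\Big(\partial_j\log\frac{f}{\pi_J} + \frac12T_j\Big)\Big\}
\]
\[
\qquad + \text{the terms independent of } f + O(n^{-5/2}).
\]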



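V. COMPARISON BETWEEN $S_{\pi_J}$ AND $S_f$

From the result in the previous section, we obtain the same result as that in the i.i.d. cases, which was shown by Komaki. Let us calculate the risk difference between two Bayesian spectral densities, one of which is based on the Jeffreys prior $\pi_J(\theta)$ and the other is based on an arbitrary prior $f$,

\[
n^2E_{\theta_0}[D(S_0\|S_{\pi_J})] - n^2E_{\theta_0}[D(S_0\|S_f)]
\]
\[
= \frac18g^{ij}T_iT_j + \nabla^{(e)}_i\Big(g^{ij}\frac12T_j\Big) - \frac12g^{ij}\Big(\partial_i\log\frac{f}{\pi_J} + \frac12T_i\Big)\Big(\partial_j\log\frac{f}{\pi_J} + \frac12T_j\Big)
- \nabla^{(e)}_i\Big\{g^{ij}\Big(\partial_j\log\frac{f}{\pi_J} + \frac12T_j\Big)\Big\} + O(n^{-1/2})
\]
\[
= -\frac12g^{ij}\Big(\partial_i\log\frac{f}{\pi_J}\Big)\Big(\partial_j\log\frac{f}{\pi_J}\Big) - \frac12g^{ij}T_i\,\partial_j\log\frac{f}{\pi_J}
- \nabla^{(e)}_i\Big\{g^{ij}\Big(\partial_j\log\frac{f}{\pi_J}\Big)\Big\} + O(n^{-1/2})
\]
\[
= -\frac12g^{ij}\Big(\partial_i\log\frac{f}{\pi_J}\Big)\Big(\partial_j\log\frac{f}{\pi_J}\Big) - \nabla_i\Big\{g^{ij}\Big(\partial_j\log\frac{f}{\pi_J}\Big)\Big\} + O(n^{-1/2})
\]
\[
= -\frac12g^{ij}\Big(\partial_i\log\frac{f}{\pi_J}\Big)\Big(\partial_j\log\frac{f}{\pi_J}\Big) - \frac{\pi_J}{f}\Delta\Big(\frac{f}{\pi_J}\Big) + g^{ij}\Big(\partial_i\log\frac{f}{\pi_J}\Big)\Big(\partial_j\log\frac{f}{\pi_J}\Big) + O(n^{-1/2})
\]
\[
= \frac12g^{ij}\Big(\partial_i\log\frac{f}{\pi_J}\Big)\Big(\partial_j\log\frac{f}{\pi_J}\Big) - \frac{\pi_J}{f}\Delta\Big(\frac{f}{\pi_J}\Big) + O(n^{-1/2}).
\]

In the above calculation, we used the relation $\nabla^{(e)}_iv^i = \nabla_iv^i - \frac12T_iv^i$ and some formulas with respect to the Laplace-Beltrami operator (see Appendix D).

Setting $f = \pi_Jh$, the right-hand side becomes $\frac12g^{ij}(\partial_i\log h)(\partial_j\log h) - \Delta h/h$, which is positive whenever $h > 0$ and $\Delta h < 0$. Thus, we obtain the following theorem.

Theorem 1.

For the ARMA models, if there exists a superharmonic function $h(\theta)$ such that $\Delta h < 0$ and $h > 0$, then up to the second order, one can improve $S_{\pi_J}$ based on the Jeffreys prior by adopting the superharmonic prior $\pi_h(\theta) := \pi_J(\theta)h(\theta)$.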

VI. SUMMARY 

In the present paper, we obtained the asymptotic expansion of the risk difference of the KL-divergence in the ARMA model. If there exists a positive superharmonic function $h(\theta)$ on the corresponding ARMA model manifold, it is better in the Bayesian framework to adopt a superharmonic prior $\pi_h(\theta) := \pi_J(\theta)h(\theta)$ as a noninformative prior. This is because Bayesian spectral densities based on a superharmonic prior asymptotically dominate those based on the Jeffreys prior in terms of the averaged Kullback-Leibler loss.

It has been shown that there exists a superharmonic prior for the AR(2) process and the MA(2) process. The explicit form of the superharmonic prior has also been obtained, and numerical simulation supports our theorem. The existence of superharmonic priors for the higher order ARMA($p$,$q$) processes ($p + q \geq 3$) remains to be discussed.
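APPENDIX A: ASYMPTOTIC EXPANSION OF $\delta$

In the present section, we evaluate $E_{\theta_0}[\delta^l]$ up to $O(n^{-1})$. The key equation is as follows:

\[ 0 = \frac{\partial l_n(\hat\theta)}{\partial\theta^m} = \frac{\partial l_n(\theta_0)}{\partial\theta^m} + \frac{\partial^2l_n(\theta_0)}{\partial\theta^m\partial\theta^i}\,\delta^i + \frac12\frac{\partial^3l_n(\theta_0)}{\partial\theta^m\partial\theta^i\partial\theta^j}\,\delta^i\delta^j + \cdots. \]

We set $L_m := \frac1n\frac{\partial l_n(\theta_0)}{\partial\theta^m}$, etc. We omit $\theta_0$ in the remainder of this section. Since $\delta^i = O_p(n^{-1/2})$, higher order terms are recursively obtained and

\[ \delta^l = -L^{lm}L_m - \frac12L^{lk}L_{kij}(L^{im}L_m)(L^{jn}L_n) + O_p(n^{-3/2}), \]

where $-L^{lm}L_m = O_p(n^{-1/2})$ and $L^{lk}L_{kij}(L^{im}L_m)(L^{jn}L_n) = O_p(n^{-1})$. Now we evaluate $E[\delta^l]$ up to $O(n^{-1})$. Since $L^{lk}L_{kij}(L^{im}L_m)(L^{jn}L_n) = O_p(n^{-1})$,

\[ E[L^{lk}L_{kij}(L^{im}L_m)(L^{jn}L_n)] = E[L^{lk}]\,E[L_{kij}]\,E[L^{im}]\,E[L^{jn}]\,E[L_mL_n] + O(n^{-3/2}). \]

Note the identities $E[L_m] = 0$ and $E[L_mL_n] = -\frac1nE[L_{mn}]$. Let us denote $m_{ij} := E[L_{ij}]$, $m_{ijk} := E[L_{ijk}]$, etc., and let $m^{ij}$ be the inverse matrix of $m_{ij}$. Note that $E[L^{ij}] = m^{ij} + O(n^{-1})$. Thus,

\[ E[L^{lk}L_{kij}(L^{im}L_m)(L^{jn}L_n)] = \frac1nm^{lk}m_{kij}m^{im}m^{jn}(-m_{mn}) + O(n^{-3/2}) = -\frac1nm^{lk}m_{kij}m^{ij} + O(n^{-3/2}). \]

Collecting all the terms, we can rewrite $E[\delta^l]$ as

\[ E[\delta^l] = E\Big[-L^{lm}L_m - \frac12L^{lk}L_{kij}(L^{im}L_m)(L^{jn}L_n)\Big] + O(n^{-3/2}) = -E[L^{lm}L_m] + \frac{1}{2n}m^{lk}m_{kij}m^{ij} + O(n^{-3/2}). \]

Here, the first term is $O(n^{-1})$, and we need to evaluate the second principal term of $L^{lm}L_m$:

\[ E[L^{lm}L_m] = E[\{m^{lm} - m^{li}\,\delta L_{ik}\,m^{km} + O_p(n^{-1})\}L_m] = m^{lm}E[L_m] - m^{li}E[\delta L_{ik}L_m]\,m^{km} + O(n^{-3/2}) \]
\[ \qquad = -\frac1nm^{li}m_{ik,m}m^{km} + O(n^{-3/2}), \]

where $m_{ik,m} := nE[\delta L_{ik}L_m]$. Thus, we obtain

\[ E[\delta^l] = \frac1nm^{li}m_{ik,m}m^{km} + \frac{1}{2n}m^{lk}m_{kij}m^{ij} + O(n^{-3/2}). \]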
APPENDIX B: EXPLICIT FORM OF THE EXPECTATION OF THE LOG LIKELIHOOD FOR THE ARMA MODEL

In this section, we calculate $m_{ij}$, $m_{ijk}$ and $m_{ij,k}$ for the ARMA model. Up to $O(n^{-1})$, they can be written in terms of the geometrical quantities $g_{ij}$, $T_{ijk}$ and $\Gamma^{(m)}_{i,jk}$, which are defined by the spectral density $S(\omega|\theta)$.



1. Trace formula 



Before going into details, we mention the trace formulas. Suppose that $\{x_i\}_{i=1}^n$ is subject to a stationary Gaussian process with zero mean, i.e., $(x_1,\dots,x_n) \sim N(0,\Sigma)$. (The $(s,t)$-th component $\Sigma_{st}$ depends only on $s - t$, and such a matrix is called a Toeplitz matrix.) Then for any symmetric matrices $A$ and $B$, the following equations hold:

\[ E[X_iA_{ij}X_j] = \mathrm{Tr}[\Sigma A], \]
\[ E[X_iA_{ij}X_jX_kB_{kl}X_l] = \mathrm{Tr}[A\Sigma]\,\mathrm{Tr}[B\Sigma] + 2\,\mathrm{Tr}[A\Sigma B\Sigma]. \]
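Both trace formulas follow from the Gaussian fourth-moment identity $E[x_ix_jx_kx_l] = \Sigma_{ij}\Sigma_{kl} + \Sigma_{ik}\Sigma_{jl} + \Sigma_{il}\Sigma_{jk}$. The sketch below builds this moment tensor explicitly and compares the contracted result with the trace expression; it is a deterministic check for small matrices, with hypothetical function names, and is not taken from the paper.

```python
import numpy as np

def fourth_moment_tensor(cov):
    """E[x_i x_j x_k x_l] for a zero-mean Gaussian vector with covariance cov."""
    return (np.einsum("ij,kl->ijkl", cov, cov)
            + np.einsum("ik,jl->ijkl", cov, cov)
            + np.einsum("il,jk->ijkl", cov, cov))

def quadratic_form_product_mean(A, B, cov):
    """E[(x'Ax)(x'Bx)] obtained by contracting A_ij B_kl with the fourth moments."""
    return np.einsum("ij,kl,ijkl->", A, B, fourth_moment_tensor(cov))

def trace_formula(A, B, cov):
    """Tr[A cov] Tr[B cov] + 2 Tr[A cov B cov], valid for symmetric A and B."""
    return np.trace(A @ cov) * np.trace(B @ cov) + 2.0 * np.trace(A @ cov @ B @ cov)
```

The symmetry of $A$ and $B$ is what makes the last two pairings in the moment tensor contribute the same trace, giving the factor 2.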



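2. $m_{ij} := E_{\theta_0}[L_{ij}]$ for the ARMA model

In this subsection, we calculate $m_{ij}$. Differentiating the log likelihood twice, we have

\[
L_{ij} = \frac1n\frac{\partial^2l_n}{\partial\theta^i\partial\theta^j}
= \frac{1}{2n}X_n'\Big\{\Sigma^{-1}\frac{\partial^2\Sigma}{\partial\theta^i\partial\theta^j}\Sigma^{-1} - \Sigma^{-1}\frac{\partial\Sigma}{\partial\theta^i}\Sigma^{-1}\frac{\partial\Sigma}{\partial\theta^j}\Sigma^{-1} - \Sigma^{-1}\frac{\partial\Sigma}{\partial\theta^j}\Sigma^{-1}\frac{\partial\Sigma}{\partial\theta^i}\Sigma^{-1}\Big\}X_n
\]
\[
\qquad + \frac{1}{2n}\mathrm{Tr}\Big(\Sigma^{-1}\frac{\partial\Sigma}{\partial\theta^i}\Sigma^{-1}\frac{\partial\Sigma}{\partial\theta^j}\Big) - \frac{1}{2n}\mathrm{Tr}\Big(\Sigma^{-1}\frac{\partial^2\Sigma}{\partial\theta^i\partial\theta^j}\Big).
\]

We set

\[ S_{ij} := \frac{1}{2n}\Big\{\Sigma^{-1}\frac{\partial^2\Sigma}{\partial\theta^i\partial\theta^j}\Sigma^{-1} - \Sigma^{-1}\frac{\partial\Sigma}{\partial\theta^i}\Sigma^{-1}\frac{\partial\Sigma}{\partial\theta^j}\Sigma^{-1} - \Sigma^{-1}\frac{\partial\Sigma}{\partial\theta^j}\Sigma^{-1}\frac{\partial\Sigma}{\partial\theta^i}\Sigma^{-1}\Big\}, \]
\[ J'_{ij} := \frac{1}{2n}\mathrm{Tr}\Big(\Sigma^{-1}\frac{\partial\Sigma}{\partial\theta^i}\Sigma^{-1}\frac{\partial\Sigma}{\partial\theta^j}\Big), \qquad h_{ij} := \frac{1}{2n}\mathrm{Tr}\Big(\Sigma^{-1}\frac{\partial^2\Sigma}{\partial\theta^i\partial\theta^j}\Big). \]

Taking the expectation of $L_{ij} = X_n'S_{ij}X_n + J'_{ij} - h_{ij}$, we obtain

\[ m_{ij} = E[X_n'S_{ij}X_n + J'_{ij} - h_{ij}] = \mathrm{Tr}\,S_{ij}\Sigma + (J'_{ij} - h_{ij}). \]

The first term is rewritten as

\[ \mathrm{Tr}\,S_{ij}\Sigma = \frac{1}{2n}\mathrm{Tr}\Big\{\Sigma^{-1}\frac{\partial^2\Sigma}{\partial\theta^i\partial\theta^j} - \Big(\Sigma^{-1}\frac{\partial\Sigma}{\partial\theta^i}\Sigma^{-1}\frac{\partial\Sigma}{\partial\theta^j} + \Sigma^{-1}\frac{\partial\Sigma}{\partial\theta^j}\Sigma^{-1}\frac{\partial\Sigma}{\partial\theta^i}\Big)\Big\} = h_{ij} - 2J'_{ij}. \]

Thus,

\[ m_{ij} = (h_{ij} - 2J'_{ij}) + (J'_{ij} - h_{ij}) = -J'_{ij}. \]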

3. $m_{ijk} := E_{\theta_0}[L_{ijk}]$ for the ARMA model

In this subsection, we calculate $m_{ijk}$. Using the notation in the previous subsection, $L_{kij}$ is rewritten as

1 8 
Putting the second term and the third term together, we obtain 

A/. -Aft,, 



2n V 89''89' 89^ 89' 89W 



"2^-^'^ d9>^ 99^ 991 +^ d9^^ 89^^ 89J 
-^Tr rE-^^U ^Tr fE-I^E- 



2n V 89^89'89i ) 2n \ 89^ 89'89^ 
(r' -i-T' -i-T' ) — (T' -\-T' ] — N' 

\^ jki ~ ikj ~ kiji kij ~ ikj / ^^kiji 



where 

/ 1 f}E a^E \ 

^feii- "^^y^ QQk^ 89^89^ )' 
QqU^ QQl^ Q^),



and 



Since the first term is tedious to write out, we introduce the following notation:

\[ A_{ijk} + (\text{permutation terms}) := A_{ijk} + A_{ikj} + A_{jki} + A_{jik} + A_{kij} + A_{kji}, \]
\[ B_{ijk} + (\text{cyclic terms}) := B_{ijk} + B_{jki} + B_{kij}. \]



Then, 



^ - Aq 



= 2^ m>^ + (permutafon terms)| 



] 



1 If (9^S \ 

+(cyclic terms) + - (^E-g^jg^E- j . 



Taking average, we obtain 



— TVjS'fcijE 



1 _ r i^E^ ,aE_ ,aE\ , . ,1 

= 2^1^ I a^^^^^^^ J + (permutation terms)| 

-^Tr I ("s-^S-^ + S-^S-^^ 
2n \V de^dO' 80-' 80' 89'' 80^ J 

— ^jfe + ^ifei + ^feij + '^jik + ^fej + "^kji ~ '^(^ijk + + F'^jj) + A^j(. 



kij 

ijk- 

In the last line, the cyclic property $T'_{ijk} = T'_{jki} = T'_{kij}$, etc. was used, which is due to the property of the trace operation $\mathrm{Tr}\,ABC = \mathrm{Tr}\,BCA = \mathrm{Tr}\,CAB$.
Thus, $m_{kij}$ is written in the form of



nikij :— E[Lkij] 

+ + ^'jki + ^ikj) ~ C^kij + ^fcj) ~ ^ijk 



4. $m_{ij,k} := nE_{\theta_0}[L_{ij}L_k]$ for the ARMA model

In this subsection, we calculate $m_{ij,k} := nE_{\theta_0}[L_{ij}L_k]$.



Lij — X^SijXn ~^ Jij hij 



where 



Thus, 



1 fx'U-^B^Ax„-TrU-^'>^ 



X'nBkXn — ElX'j^BkXnl, 



" 2n\ Kde'' 



^^ij,k 



E[{X'^S,jXr, + 4 - h,j){X'^BkX^ - Tr[i?,E])] 
Tr[SijE]Tr[Bk^ + 2Tr[Sij^Bk^ - Tr[Sij^Tr[Bk^ 
2Tr[5i,ESfcE] 
2Tr 



2n I 



i/J-IvrE-^E-^ 
n I 2n V a^^a^^ 5^^= 



-J-Trl^E-^E-^E-^ 
2n \ de^ 89' 89'' 



n 
Now we summarize the results:



T^ij = ~ Jij , 

'rriij^k = ^(X'kij ~ '^ijk ~ '^jik)- 



APPENDIX C: MOMENT FORMULA 

Let us calculate the $p$-th moment of the multivariate Gaussian distribution by using the characteristic function (chf). We set the mean parameter equal to zero and denote the covariance matrix by $S$. Then, the chf is given by

\[ \phi(t) := E[e^{it'X}] = \exp\Big(-\frac12t'St\Big), \]

where $t = (t_1,\dots,t_k)'$. For even $p$,

\[ \Gamma^{i_1\cdots i_p} = (-i)^p\,\frac{\partial}{\partial t_{i_1}}\cdots\frac{\partial}{\partial t_{i_p}}\exp\Big(-\frac12t'St\Big)\Big|_{t=0}. \]

Note that for odd $p$, the moments vanish.



APPENDIX D: LAPLACE-BELTRAMI OPERATOR 

We briefly summarize the Laplace-Beltrami operator on a Riemannian manifold (see, e.g., Kobayashi and Nomizu). The covariant derivative in the $j$-th direction of a vector $v^i$ is defined by

\[ \nabla_jv^i := \partial_jv^i + \Gamma^i_{jk}v^k, \]

where $\partial_j$ denotes $\partial/\partial\theta^j$. When we set $v^i = \nabla^i\phi = \partial^i\phi = g^{ij}\partial_j\phi$ for a scalar function $\phi$, the Laplace-Beltrami operator is defined by

\[ \Delta\phi := \nabla_i\nabla^i\phi. \]

Denoting $g := \det g_{ij}$, and since $\partial_jg/g = g^{ki}\partial_jg_{ki}$,

\[ \Gamma^i_{ji} = \frac12g^{ik}(\partial_ig_{kj} + \partial_jg_{ki} - \partial_kg_{ji}) = \frac12g^{ik}\partial_jg_{ki} = \frac{\partial_jg}{2g}. \]

Thus we can rewrite $\Gamma^i_{ji} = \frac{1}{\sqrt g}\partial_j(\sqrt g) = \partial_j\log(\sqrt g) = \partial_j\log\pi_J$.